Learning Semantic Representations for Nonterminals in Hierarchical Phrase-Based Translation

نویسندگان

  • Xing Wang
  • Deyi Xiong
  • Min Zhang
چکیده

In hierarchical phrase-based translation, coarse-grained nonterminal Xs may generate inappropriate translations due to the lack of sufficient information for phrasal substitution. In this paper we propose a framework to refine nonterminals in hierarchical translation rules with real-valued semantic representations. The semantic representations are learned via a weighted mean value and a minimum distance method using phrase vector representations obtained from large scale monolingual corpus. Based on the learned semantic vectors, we build a semantic nonterminal refinement model to measure semantic similarities between phrasal substitutions and nonterminal Xs in translation rules. Experiment results on ChineseEnglish translation show that the proposed model significantly improves translation quality on NIST test sets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning Bilingual Distributed Phrase Representations for Statistical Machine Translation

Following the idea of using distributed semantic representations to facilitate the computation of semantic similarity between translation equivalents, we propose a novel framework to learn bilingual distributed phrase representations for machine translation. We first induce vector representations for words in the source and target language respectively, in their own semantic space. These word v...

متن کامل

Generalizing Hierarchical Phrase-based Translation using Rules with Adjacent Nonterminals

Hierarchical phrase-based translation (Hiero, (Chiang, 2005)) provides an attractive framework within which both shortand longdistance reorderings can be addressed consistently and ef ciently. However, Hiero is generally implemented with a constraint preventing the creation of rules with adjacent nonterminals, because such rules introduce computational and modeling challenges. We introduce meth...

متن کامل

Towards Bidirectional Hierarchical Representations for Attention-based Neural Machine Translation

This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder. To maximize the predictive likelihood of target words, a weighted variant of an attention mechanism is used to balance the attentive information between lexical...

متن کامل

CCG augmented hierarchical phrase-based machine translation

We present a method to incorporate target-language syntax in the form of Combinatory Categorial Grammar in the Hierarchical Phrase-Based MT system. We adopt the approach followed by Syntax Augmented Machine Translation (SAMT) to attach syntactic categories to nonterminals in hierarchical rules, but instead of using constituent grammar, we take advantage of the rich syntactic information and fle...

متن کامل

NTT statistical machine translation for IWSLT 2006

We present the NTT translation system that is experimented for the evaluation campaign of “International Workshop on Spoken Language Translation (IWSLT).” The system consists of two primary components: a hierarchical phrase-based statistical machine translation system and a reranking system. The former is conceptualized as a synchronous-CFG in which phrases are hierarchically combined using non...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015